High-speed packet switch.

A packet switch of the type in which packets received in the
switch are stored in memory until they are output. In the switch
fabric of the switch, packets are serially received in input
shift registers (401) wide enough to store an entire packet,
output in parallel to memory (407) which is as wide as the input
shift register, moved in parallel in the memory, and output in
parallel to an output shift register (405). The bus (403)
connecting the input shift registers, the output shift register,
and the memory is as wide as the input shift register, but does
not cross the boundaries of the semiconductor chips making up
the switch fabric, thus avoiding the electrical problems of very
wide buses. In the disclosed implementation, there are 14 input
lines and 14 output lines. A switch memory is associated with
each output line and receives packets from all 14 input lines,
accepting only those destined for the output line with which it
is associated. Each switch memory includes a controller, memory
and a communications interface for the controller, and a set of
switch memory VLSI devices. Each switch memory VLSI device
includes a first shift register for receiving slices of the
packet and a bus, a memory and a second shift register for
outputting the slices. The bus, the memory, and the second shift
register are as wide as the first shift register.
Field of the Invention

The invention concerns packet switching generally 
and packet switching in very high-speed networks in 
particular. 

Description of the Prior Art 

Increasingly, devices communicate with each other by sending
packets of digital data. As shown in FIG. 1, a packet (P 113) is
a sequence of bits which one device sends another. Packets
generally have two parts: a message (MSG 115), which is the
sequence of bits making up the actual message being sent, and a
header (HDR 117), which contains control information that the
communications system over which the packet is sent uses in
transferring the message. At a minimum, header 117 will contain
destination (D 119), a value which indicates the destination to
which the message is directed. Header 117 may also contain
information such as the length of message 115, the type of
message 115, or the source of message 115. The manner in which
destination 119 indicates the destination will depend on the
kind of communications system over which the message is sent.
For example, destination 119 may specify the address of the
device which is to receive the packet in the network or it may
specify a virtual circuit which is currently connecting the
source of the packet with its destination.

One way of sending packets between devices is 
by means of a switched network. A switched network 
is made up of nodes connected by incoming and out- 
going links. A packet switch at each node of the net- 
work receives packets on its incoming links, determi- 
nes from destination 119 where each packet is going, 
and switches the packet to the outgoing link which will 
take it to its destination. 

FIG. 1 shows a prior-art packet switch 101. Packet switch 101
receives packets from a number of input links at input ports
(IP) 107(0..n), switches the packets in switch fabric (SF) 103,
and outputs them to output links at output ports (OP) 109(0..n).
In most common kinds of prior-art packet switches, switch fabric
103 includes memory 105. Memory 105 contains at a minimum an
output queue (OQ) 111 for each output port 109. Switch fabric
103 does the switching by placing each packet received on an
input port 107 onto the tail of output queue 111 for output port
109 for the output link which will take the packet to its
destination. As switch fabric 103 adds incoming packets 113 to
the tails of the output queues, it takes outgoing packets 113
from the heads of the output queues 111 and provides them to
output ports 109 corresponding to the queues 111.
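
The output-queue behavior just described can be illustrated with a minimal sketch. This is not the prior-art implementation itself; the class name, the `route` callable, and the representation of a packet as a (destination, message) pair are assumptions made only for illustration.

```python
from collections import deque


class OutputQueuedFabric:
    """Toy model of a switch fabric with one output queue per output port."""

    def __init__(self, num_ports, route):
        # `route` is a hypothetical callable mapping a destination value
        # to the index of the output port that leads toward it.
        self.route = route
        self.queues = [deque() for _ in range(num_ports)]

    def receive(self, destination, message):
        # Place the packet on the tail of the queue for its output port.
        port = self.route(destination)
        self.queues[port].append((destination, message))

    def transmit(self, port):
        # Take the packet at the head of the queue, or None if it is empty.
        queue = self.queues[port]
        return queue.popleft() if queue else None
```

Packets leave each port in the order in which they were queued for it, regardless of the input port on which they arrived.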

A problem with packet switches in modern networks is that the
switch is often unable to keep up with the speed at which the
links can operate. For example, a glass fiber link can operate
at speeds of 1 Gigabyte/second. If the switch cannot keep up
with that speed, that is, unless the switch can handle
simultaneous inputs from a number of links operating at that
speed, the network will not be able to use the full capacity
offered by the links, but will instead be limited by the rate at
which the switches can transfer packets from one link to
another. A survey of architectures for fast packet switches may
be found in Fouad A. Tobagi, "Fast Packet Switch Architectures
for Broad-band Integrated Services Digital Networks",
Proceedings of the IEEE, vol. 78, No. 1, January, 1990.

A particularly troublesome area in designing packet switches of
the type of packet switch 101 is the limitations that memory 105
places on operating speed. Modern packet switches are of course
made up of integrated circuits; in particular, memory 105 is
made up of a number of off-the-shelf dynamic RAM integrated
circuits. Memory 105 must be both large and fast; however,
off-the-shelf dynamic RAM integrated circuits are either large
and slow or small and fast. For example, current CMOS memories
of 256 Kbits have a cycle time of 35 ns, while those of 4 Mbits
have a cycle time of 200 ns. Worse, the product of memory size
with cycle time has remained remarkably constant over
generations of memory technologies.

If switch 101 is to operate at the necessary speed, enough fast
memory integrated circuits must be provided so that memory 105
is wide enough to store an entire packet in a single row of the
memory, so that the entire packet can be written to or read from
memory 105 in a single operation. Further, the packet must be
carried to and from the memory by a bus which is as wide as the
memory. However, as a bus becomes wider, it also becomes slower.
The large number of parallel lines increases distortion and skew
in the data signals, and the bus cycle time must be increased to
counteract these effects. Further, crosstalk between the many
data lines increases the noise level, so that larger bus drivers
and more sensitive receivers are required. As a result of all of
these factors, the speed advantages gained by a wide memory 105
are limited by the slowness of the bus which connects the memory
integrated circuits making up memory 105 to the remaining
integrated circuits making up switch fabric 103, and switch 101
cannot provide the gigabytes/second transfer rate which is
needed.

It is an object of the high-speed packet switch disclosed in
this patent application to overcome the foregoing problems and
limitations of prior-art packet switches.



A switch fabric according to the invention includes 
• packet receiving means coupled to a plurality 
of input ports for serially receiving the packets
from the input ports; 

• packet outputting means coupled to a plurality 
of output ports for serially outputting the pack- 
ets to the output ports; 

• memory means for storing the packets; and 

• transfer means for transferring the packets in 
parallel between the packet receiving means, 
the packet outputting means, and the memory 
means, the switch fabric being fabricated in 
one or more integrated circuits such that the 
transfer means remains within the boundaries 
of the integrated circuits. 

Embodiments of the invention include the following:

• A packet switch which employs the above 
switch fabric. 

• A VLSI integrated circuit which implements a 
slice of the above switch fabric. 

• A switch fabric in which there is a plurality of
packet receiving means and each of the packet 
receiving means receives packets at a rate 
which is independent of the rates at which the 
other packet receiving means receive the 
packets. In this switch fabric, there is also a 
plurality of packet outputting means, and each 
of them outputs packets at a rate which is in- 
dependent of the rates at which the other pack- 
et outputting means output the packets. 

• A switch fabric in which there are packet trans- 
fer means for moving packets in parallel be- 
tween the packet receiving means, the packet 
outputting means, and the packet memory 
means and moving means for moving the pack- 
ets in parallel between the locations in the 
packet memory means. 

The foregoing and other objects and advantages of 
the invention will be apparent to one of ordinary skill 
in the art who peruses the following Drawing and De- 
tailed Description, wherein: 

Brief Description of the Drawing 

FIG. 1 is a block diagram of a prior-art packet switch which
employs memory;
FIG. 2 is a block diagram of a packet switch which is built
according to the principles of the invention;
FIG. 3 is a block diagram of the switch memory component of the
packet switch of FIG. 2;
FIG. 4 is a block diagram of the data paths of a switch memory
VLSI 303;
FIG. 5 is a block diagram of the functional components of switch
memory VLSI 303;
FIG. 6 is a diagram of the division of nybble memory 407 of
switch memory VLSI 303 into queues; and
FIG. 7 is a diagram of the arrangement of the chief components
of switch memory VLSI 303 in the VLSI.

The reference numbers employed in the Drawing and the Detailed
Description have three or more digits. The two least significant
digits are a number within a figure; the remaining digits are
the figure number. Thus, the element with the reference number
"305" is first shown in FIG. 3. Further, individual ones of
repeated elements are represented by means of numbers or letters
in parentheses following the reference number of the element.
Thus, 303(0) means the first one of element 303, while 303(i)
means any one of the elements 303.

Detailed Description

The "Detailed Description" will describe a high-speed packet
switch which has been implemented according to the principles of
the invention. The description will begin with an overview of
the packet switch, will then describe the switch memory
component of the packet switch, and will finally describe a
novel very large scale integrated circuit (VLSI) employed in the
switch memory component. In the following, the terms serial and
parallel are used as follows: a set of portions of a packet is
processed serially when the portions belonging to the set are
sequentially processed in the order in which the portions occur
in the packet; the set of portions is processed in parallel when
all of the portions in the set are processed at once.

Overview of the High-Speed Packet Switch: FIG. 2

FIG. 2 provides an overview of a high-speed packet switch 201
which is implemented according to the principles of the
invention. High-speed packet switch 201 receives packets
simultaneously from 14 serial input links (IL) 203(0..13) and
provides them simultaneously to 14 serial output links (OL)
221(0..13). The packets travel in the input links 203 and output
links 221 as a serial sequence of single bits, as indicated by
the "1" notation on the links. Because packets are processed
simultaneously, there is an input portion (IP) 202(i) of packet
switch 201 corresponding to each input link 203(i) and an output
portion (OP) 210(i) corresponding to each output link 221(i).
The input portions 202 are coupled to the output portions 210 by
means of broadcast bus 207; broadcast bus 207 carries packets
received by each of the input portions 202 to all of the output
portions 210. As will be explained in more detail later, output
portion 210(j) corresponding to an output link 221(j) only
accepts packets which will reach their destinations via output
link 221(j). This interaction between broadcast bus 207 and
output portions 210 thus enables switch 201 to switch packets as
required by the packet's destination 119. As is apparent from
the foregoing, switch 201 may be implemented with more or fewer
than 14 input and output links, and there may even be more or
fewer input links than output links.

EP 0 569 173 A2

Since each of the input portions 202 and output portions 210 is
substantially identical to the others, the remainder of the
discussion of FIG. 2 will deal with a single input portion
202(h) and a single output portion 210(i). Receiver 205(h) in
input portion 202(h) receives packets serially from a fiber
optic link and outputs each packet it receives as a serial
sequence of 32-bit words on a 32-bit bus 206(h) which is
connected to router 208(h) and to broadcast bus 207. Receiver
205(h) is made of two standard devices: AT&T's ODL200 part
converts between electrical and optical signals, and Advanced
Micro Devices' TAXI receiver chip (part number AM7969) converts
the single-bit serial input stream into a serial stream of
32-bit words. Router 208(h) watches for headers 117 and packet
ends. When router 208(h) detects a header 117, it copies
destination 119 from the header 117 and determines, from
destination 119 and a table in router 208(h) which relates
destinations to output links, what the proper output link 221 is
for the packet. Then, when it detects the end of the packet, it
appends an output link specifier (OLS) 223 to the end of the
packet. Routing functions are described in detail in D.
Bertsekas and R. Gallager, Data Networks, Prentice Hall, 1987.

In the preferred embodiment, output link specifier 223 is two
32-bit words. Bits 28-31 of each of the two 32-bit words specify
the output link. There are three possibilities:

• Output to a single output link 221(j): bits 28-31 of the first
32-bit word have the value 8, expressed hexadecimally; bits
28-31 of the second 32-bit word have the number "j" (i.e., a
number between 0 and 13), expressed hexadecimally.

• Output to all output links 221(0..13): bits 28-31 of both
words have the hexadecimal value 0.

• Output to a group of output links 221: bit 28 of the first
word has the value 0 (in the hexadecimal value 8, this bit has
the value 1); bits 29-31 of the first word and 28-31 of the
second word have a seven-bit group code specifying a group of
output links.
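
The three cases can be sketched as a small decoder. This is an illustration under stated assumptions, not the circuit of the preferred embodiment: the function name and the returned labels are hypothetical, bit 28 is taken to be the high-order bit of each four-bit field (as the remark about hexadecimal 8 suggests), and the order in which the two fields are concatenated into the seven-bit group code is assumed.

```python
def decode_output_link_specifier(n1, n2):
    """Classify an output link specifier from its two 4-bit fields.

    n1 and n2 are bits 28-31 of the first and second 32-bit word,
    each given as an integer in the range 0..15.
    """
    if not (0 <= n1 <= 15 and 0 <= n2 <= 15):
        raise ValueError("each field is four bits wide")
    if n1 == 0x8:
        # Single output link: the second field carries the link number j.
        return ("single", n2)
    if n1 == 0 and n2 == 0:
        # Both fields zero: output to all output links.
        return ("all", None)
    if (n1 & 0x8) == 0:
        # Group: three low bits of n1 followed by the four bits of n2
        # (assumed ordering) form the seven-bit group code.
        return ("group", ((n1 & 0x7) << 4) | n2)
    raise ValueError("encoding not described in the text")
```

For instance, fields 8 and 5 select output link 5, while fields 3 and A (hexadecimal) yield group code 3A under the assumed ordering.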

As the sequence of 32-bit words which contain the packet are
output to broadcast bus 207, they are received and stored in
each of the switch memories 211. At the same time, each of the
other input portions 202 in switch 201 may be receiving packets
and outputting 32-bit words to broadcast bus 207, and each
switch memory 211 stores the 32-bit words which it receives from
those input portions 202 as well.

When output link specifier 223 for a packet arrives in the
switch memories 211, each switch memory 211(i) determines from
output link specifier 223 whether the packet is to be output on
output link 221(i). If not, switch memory 211(i) simply discards
the stored packet; if it is, switch memory 211(i) transfers the
stored packet (without the output link specifier) to output
queues 213(i), a memory which contains one or more output queues
of packets to be output via output link 221(i).

Switch memory 211(i) services the queues in output queues 213(i)
in a programmed order. The packet at the head of the queue
currently being serviced is output as a sequence of 32-bit words
on bus 215(i). Chopper 217(i) reads header 117 to determine the
length of the packet and counts the bytes of the packet as they
are transmitted on bus 215(i). When the last byte has been
transmitted, chopper 217(i) signals the end of the packet to
switch memory 211(i) and transmitter 219(i). As transmitter
219(i) receives the 32-bit words on bus 215(i), it transmits the
bits in the words sequentially onto output link 221(i). It
ceases transmitting in response to the signal from chopper
217(i). Switch memory 211(i) responds to the same signal by
placing the first 32 bits of the next packet to be transmitted
on bus 215(i).

Detail of a Switch Memory 211(i): FIG. 3 

FIG. 3 is a detailed block diagram of a single 
switch memory 211(i) in a preferred embodiment. 
Switch memory 211(i) includes 12 separate integrat- 
ed circuits: 8 switch memory VLSIs (SWMV) 303 
(0..7), a switch memory CPU IC 307, a boot ROM IC 
309, a static RAM IC 311, and a communications IC 
313. ICs 307, 309, 311, and 313 are connected to the
8 switch memory VLSIs by CPU bus 305. CPU IC 307 
is a MIPS R3000-based microcontroller, part number 
IDT3052, manufactured by Integrated Device Tech- 
nology. 

As will be explained in more detail below, SWM CPU 307 controls
the operation of switch memory 211(i) by setting parameters in
the 8 switch memory VLSIs and moving packets among the queues in
output queues 213(i). Additionally, SWM CPU 307 can read and
write portions of packets stored in output queues 213(i). Boot
ROM 309 contains code which is executed by SWM CPU 307 to put
switch memory 211(i) into a condition in which it can commence
operation; static RAM 311 contains programs and data by means of
which CPU 307 controls switch memory 211(i). CPU 307 can
transfer data between SRAM 311 and switch memory VLSIs
303(0..7). Communications IC 313, finally, permits communication
with CPU 307 by means of an RS 232 link 315. Using RS 232 link
315, data can be transferred between CPU 307 and the outside
world, and thereby between SRAM 311 and switch memory VLSIs
303(0..7) and the outside world.

The switch memory VLSIs 303(0..7) are a slice implementation of
switch memory 211. Slice implementations of components are made
up of sets of identical integrated circuits which operate in
parallel. Each of the integrated circuits processes a slice of
the input to the component. In the case of switch memory VLSIs
303(0..7), the implementation is a nybble slice implementation:
each VLSI 303(j) processes 4 bits (1 nybble) of the 32-bit
inputs received from broadcast bus 207 and the 32-bit outputs
made to bus 215(i). Beginning with the inputs, a switch memory
VLSI 303(j) receives 4 bits via broadcast bus 207 from each of
the 14 input portions 202(0..13). As shown in FIG. 3, switch
memory VLSI 303(0) receives bits 0..3 from input portions 202(0)
through 202(13), VLSI 303(1) receives bits 4..7 from those input
portions, and so on through VLSI 303(7), which receives bits
28..31 from those input portions. Thus, the 8 switch memory
VLSIs 303(0..7) together receive each 32-bit word output by
input portions 202(0..13) on buses 206(0..13). Similarly, each
switch memory VLSI 303(j) outputs 4 bits to bus 215(i). Switch
memory VLSI 303(0) outputs bits 0..3 of bus 215(i), switch
memory VLSI 303(1) outputs bits 4..7, and so on through switch
memory VLSI 303(7), which outputs bits 28..31.

Each switch memory VLSI 303(j) also contains a set of queues
(SWMVQ) 301(j). All of the queues 301(0..7) together make up
output queues 213(i). Each queue 301(j) contains a four-bit
slice of each 32-bit word stored in the queues in output queues
213(i). Thus, queue 301(0) will have bits 0..3 of each word,
queue 301(1) will have bits 4..7, and so on. Of course, a packet
is made up of many 32-bit words; consequently, for a packet
113(k) in output queues 213, bits 0..3 of each word in packet
113(k) will be stored in queues 301(0), bits 4..7 of each word
in queues 301(1), and so forth.
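
The nybble-slice arrangement can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes that bit 0 is the least significant bit of each 32-bit word, which the text does not state.

```python
def slice_word(word):
    """Split a 32-bit word into eight nybbles; slice j holds bits 4j..4j+3."""
    if not 0 <= word < 2**32:
        raise ValueError("expected a 32-bit word")
    return [(word >> (4 * j)) & 0xF for j in range(8)]


def join_slices(slices):
    """Reassemble a 32-bit word from the eight nybbles of slice_word."""
    word = 0
    for j, nybble in enumerate(slices):
        word |= (nybble & 0xF) << (4 * j)
    return word


def slice_packet(words):
    """Return per-slice nybble lists: result[j] is what slice j would store."""
    queues = [[] for _ in range(8)]
    for word in words:
        for j, nybble in enumerate(slice_word(word)):
            queues[j].append(nybble)
    return queues
```

Each of the eight lists returned by `slice_packet` has one nybble per word of the packet, so a packet of 128 words occupies 128 nybbles (512 bits) in each slice.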

Main Data Paths in Switch Memory VLSI 303(k)

FIG. 4 gives a schematic overview of the main 
data paths in a single switch memory VLSI 303(k). 
The data paths connect 14 input shift registers (ISR) 
401 with nybble memory 407 containing queues 
301(k) and two output shift registers (OSR) 405. In a
preferred embodiment each of the input shift regis- 
ters and output shift registers is 512 bits wide, as is 
nybble memory 407, which is organized as 512 512- 
bit words. Nybble memory 407, the input shift regis- 
ters 401, and the output shift registers 405 are con- 
nected by a nybble bus 403, which is also 512 bits 
wide. Because nybble bus 403 is completely con- 
tained on VLSI 303(k), it is very short and has none 
of the electrical problems associated with very wide 
buses which connect separate integrated circuits. 

As previously explained, each input portion 202 receives packets
113 as a sequence of bits from input link 203 and outputs a
sequence of 32-bit words containing the packet's bits to
broadcast bus 207. Further, each VLSI 303(k) receives sequences
of four-bit slices m..n, one slice coming from each 32-bit word
of the sequences of words output by input portions 202(0..13).
The sequences of slices are received in the 14 input shift
registers (ISR) 401(0..13), one corresponding to each of the
input portions 202. Each input shift register is long enough to
store slices of all of the 32-bit words making up the packet.
Thus, the packet switch of the preferred embodiment can handle
packets 113 with lengths up to 128 32-bit words. Input shift
registers 401 are further double-buffered, so that they can
begin receiving nybbles of another packet from their
corresponding input portion 202 while waiting to output a packet
they have just received to nybble bus 403. Similarly, VLSI
303(k) provides a four-bit slice m..n of each 32-bit word of
each packet output to bus 215(i). These slices are output from
one of output shift registers (OSR) 405(0..1), with the other
being retained as a spare in case of malfunction.

Since both nybble bus 403 and nybble memory 407 are 512 bits
wide, all of the slices in shift register 401 are written in a
single operation to a row of nybble memory 407. Similarly, when
bus 215(i) associated with switch memory 211(i) is ready to
output a packet, the row of nybble memory 407 containing the
nybble slices for the packet is read in parallel to the
currently operating output shift register 405, which then
provides the slices a nybble at a time to bits m..n of bus
215(i). Additionally, bits m..n of CPU bus 305 are received in
nybble memory 407. This arrangement permits CPU 307 to read and
write words of packets 113 stored in switch memory 211(i).

Continuing with details about input shift registers 401 and
output shift registers 405, each input shift register 401(k) is
controlled by three control signals:

• ICLK 409(k) is a clock signal from input portion 202(k)
corresponding to input shift register 401(k) which times the
input of nybbles to shift register 401(k).

• IPKT 411(k) is a signal from router 208(k) indicating that
router 208(k) has detected the end of the packet from which
shift register 401(k) is currently receiving nybbles.

• IREQ 413(k) is a signal indicating that the nybbles in shift
register 401(k) are to be output to nybble bus 403. The manner
in which it is generated will be described in more detail below.

Input shift register 401(k) outputs its contents to nybble bus
403 when IPKT 411(k) indicates that the end of a packet has been
reached and IREQ 413(k) indicates that the output link 221
corresponding to switch memory 211(i) containing switch memory
VLSI 303(k) is to receive the packet.
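
The interplay of the three signals can be sketched as a toy software model. It is only an approximation under stated assumptions: the class and method names are hypothetical, double buffering and the discarding of unwanted packets are omitted, and unused positions are padded with zeros, which the text does not specify.

```python
class InputShiftRegister:
    """Toy model of one input shift register: 128 nybbles (512 bits)."""

    DEPTH = 128

    def __init__(self):
        self.nybbles = []
        self.end_of_packet = False  # latched IPKT

    def iclk(self, nybble):
        # Each ICLK tick shifts one nybble into the register.
        if len(self.nybbles) >= self.DEPTH:
            raise OverflowError("packet longer than 128 words")
        self.nybbles.append(nybble & 0xF)

    def ipkt(self):
        # IPKT: the router has detected the end of the packet.
        self.end_of_packet = True

    def ireq(self):
        # IREQ: output the whole register to the nybble bus, but only
        # if the end of a packet has been signalled; otherwise do nothing.
        if not self.end_of_packet:
            return None
        row = self.nybbles + [0] * (self.DEPTH - len(self.nybbles))
        self.nybbles = []
        self.end_of_packet = False
        return row
```

A call to `ireq` returns a full 128-nybble row, mirroring the way the entire register is moved to nybble bus 403 in one operation.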

IREQ 413(k) is generated by switch memory VLSI 303(7) as
follows: As is apparent from FIG. 3, switch memory VLSI 303(7)
receives bits 28..31 of each word input from broadcast bus 207.
At the end of each packet is the two-word output link specifier
223; as previously described, in output link specifier 223, bits
28..31 of each word together specify a set of one or more output
links. In the preferred embodiment, IPKT signal 411(k) received
in an input shift register 401(k) signals the end of the packet
being sent on broadcast bus 207(k) and also that the next two
words on broadcast bus 207(k) will be output link specifier 223.
Switch memory VLSI 303(7) consequently responds to IPKT signal
411(k) by examining bits 28..31 of the next two words received
on broadcast bus 207(k). If these bits indicate that switch
memory 211(i) to which switch memory VLSI 303(7) belongs is to
receive the packet, VLSI 303(7) in switch memory 211(i)
generates IREQ 413(0..13) for itself and all of the other VLSIs
303. Any input shift register 401(k) which has just received an
IPKT signal 411(k) will respond to IREQ 413 by outputting its
contents to nybble bus 403.

Each output shift register 405(l) is controlled by four signal
inputs:

• OCLK 415(l) is a timing signal received from transmitter
219(i) which is receiving words from switch memory 211(i) to
which output shift register 405(l) belongs; it controls the rate
at which output shift register 405 outputs nybbles.

• OPKT 417(l) is an end-of-packet signal received from chopper
217(i) which indicates that the entire packet has been output by
switch memory 211(i).

• RDY 419(l) is a ready signal provided to transmitter 219(i)
which indicates that output shift register 405(l) is ready to
provide data.

• OREQ 421(l) is a signal provided to output shift register
405(l) indicating that it is to load the data presently on
nybble bus 403.

As can be seen from the foregoing list of control signals,
output shift register 405(l) outputs a serial sequence of
nybbles until it receives OPKT 417(l); at that point it
indicates via RDY 419(l) that it is no longer ready to transmit
nybbles; when shift register 405(l) receives OREQ 421(l), the
512-bit word of nybble memory 407 which is next to be output
from shift register 405(l) is on nybble bus 403 and shift
register 405(l) responds by loading that word from nybble bus
403. Once the word is loaded, shift register 405(l) sets RDY
419(l) to indicate that it is ready and begins transmitting
nybbles in response to output clock 415(l).
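
The cycle of loading, transmitting, and stopping can be sketched as a toy software model. The class and method names are hypothetical and the model ignores timing; it only mirrors the order of events described above.

```python
class OutputShiftRegister:
    """Toy model of one output shift register and its control signals."""

    def __init__(self):
        self.row = []
        self.pos = 0
        self.rdy = False  # RDY: ready to provide data

    def oreq(self, bus_row):
        # OREQ: load the word currently on the nybble bus, then set RDY.
        self.row = list(bus_row)
        self.pos = 0
        self.rdy = True

    def oclk(self):
        # OCLK: output the next nybble while the register is ready.
        if not self.rdy or self.pos >= len(self.row):
            return None
        nybble = self.row[self.pos]
        self.pos += 1
        return nybble

    def opkt(self):
        # OPKT: end of packet; stop transmitting until the next load.
        self.rdy = False
```

Because each register is driven by its own clock call, nothing in the model ties the output rate of one register to that of another.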

A particular advantage of the fact that each input shift
register 401 receives a separate ICLK signal 409 and each output
shift register 405 receives a separate OCLK signal 415 is that
there is no need for input portions 202 to be synchronized with
nybble memory 407 or with each other and similarly no need for
transmitter 219 to be synchronized with nybble memory 407 or
with other transmitters 219. Indeed, it is even possible for
different input portions 202 and different transmitters 219 to
operate at different rates.



Details of Switch Memory VLSI 303(k): FIGs. 5-6

FIG. 5 provides further details of a preferred em- 
bodiment of switch memory VLSI 303(k). Beginning 
with nybble memory 407, input shift registers 401, 
and output shift registers 405, in the preferred em- 
bodiment, nybble memory 407 is implemented as four 
128 x 512 bit planes 407(0..3). The first bit of a given
nybble is stored at a position in plane 407(0) and the
second, third, and fourth bits are stored at the
corresponding positions in planes 407(1..3). Nybble memory
407 receives data from and outputs data to fast
copy latch 535, which is made up of four 128 x 1 bit 
planes. Fast copy latch 535 permits reading from one 
512-bit word of nybble memory 407 and writing to an- 
other such word in a single read cycle followed by a 
single write cycle. The read cycle and write cycle are 
treated as an atomic operation. 
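
The role of the fast copy latch can be illustrated with a toy model in which a whole row is read into a latch and then written to another row. The class and attribute names are hypothetical, and a row is represented as a list of 128 nybbles rather than as four bit planes.

```python
class NybbleMemory:
    """Toy model of nybble memory: 512 rows of 512 bits (128 nybbles)."""

    ROWS = 512
    NYBBLES_PER_ROW = 128

    def __init__(self):
        self.rows = [[0] * self.NYBBLES_PER_ROW for _ in range(self.ROWS)]
        # Stands in for the fast copy latch: holds one whole row.
        self.latch = [0] * self.NYBBLES_PER_ROW

    def copy_row(self, src, dst):
        # Read cycle: the whole source row goes into the latch at once.
        self.latch = list(self.rows[src])
        # Write cycle: the latch contents go to the destination row.
        # Nothing else runs between the two steps, which models the
        # read and write cycles being treated as one atomic operation.
        self.rows[dst] = list(self.latch)
```

Moving a packet between queues in this model is a single `copy_row` call, however long the packet is.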

FIG. 5 shows only one input shift register 401, namely ISR
401(0), and only one output shift register 405, namely OSR
405(1). Each input shift register 401 and output shift register
405 is implemented as four 128 x 1 bit planes, with the bits of
a given nybble stored in the shift register being located at
corresponding positions in the four planes. In the case of input
shift registers 401, each plane is connected to one of the four
lines of broadcast bus 207 which provide nybbles to the input
shift registers, and in the case of output shift registers 405,
each plane is connected to one of the four lines of bus 215
which provide nybbles to transmitter 219(i). Further, nybble bus
403 is connected between the four planes of the shift registers
and the four planes of fast copy latch 535.

FIG. 5 further shows CPU interface 501 of switch memory VLSI
303(k). CPU interface 501 is a set of registers which SWM CPU
307 may read and write via CPU bus 305. The values written to
these registers by CPU 307 control the operation of switch
memory 211(i) to which the switch memory VLSI 303(k) belongs.
The sets of registers in all of the switch memory VLSIs
303(0..7) in a switch memory 211(i) make up a single set of
registers for the entire switch memory 211(i). Each of the
individual switch memory VLSIs 303(0..7) contains 1 nybble of
each register.

The first group of registers, bearing the reference numbers 505
through 517, have the following functions:

• SR EN 505 specifies which of output shift registers 405(0) and
(1) is to be enabled for output.

• OFFSET 507 contains two values; the first indicates the point
in input shift register 401 at which the first nybble of the
packet being received is to be placed in input shift register
401; the second indicates the point in output shift register 405
at which the output of nybbles of the packet is to begin.

• STATUS 509 is a set of bits which indicate the status of
switch memory 211(j).

• INT ENAB 511 enables switch memory VLSIs 303 to produce
interrupts to which switch memory CPU 307 responds.

• PORT 513 contains four bits making up the hexadecimal digit
which specifies the number of output link 221 to which switch
memory 211(j) belongs.

• GP 515 contains 8 bits making up the group code indicating the
group of output links 221 to which switch memory 211(j) belongs.
In determining whether to accept a packet, switch memory 211(j)
compares the contents of GP 515 with the packet's group code and
accepts the packet if any bit in its group register matches the
corresponding bit in the group code.

• FIFO 517 indicates whether output queues 213(i) are organized
as a single queue, in which case switch memory CPU 307 plays no
role in the operation of switch memory VLSI 303(i), or as 16
queues, in which case CPU 307 manages the 16 queues.

Of the above registers, PORT 513, GP 515, and OFFSET 507 require
more discussion. Beginning with PORT 513 and GP 515, as
previously pointed out, in a preferred embodiment, only VLSI
303(7) in a switch memory 211(i) receives the relevant bits of
output link specifier 223. VLSI 303(7) generates IREQ 413(k) to
the other VLSIs 303 when a comparison between output link
specifier 223 and the values in PORT 513 and GP 515 indicates
that switch memory 211(i) is to retain the packet. The
comparison is made by address comparator 533, which is connected
to broadcast bus 207(28..31). Address comparator 533 is included
in all VLSIs 303, but only the comparison in VLSI 303(7) results
in the generation of IREQ 413(k). In a preferred embodiment, a
bit (not shown) in control registers 503 is set to indicate
which of the VLSIs 303(0..7) is to generate IREQ 413(k).
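
The comparison against PORT 513 and GP 515 can be sketched as follows. This is an illustration, not the logic of address comparator 533: the function name and the `kind` labels are hypothetical, and the statement that a group bit "matches" is interpreted here as the same bit being set in both the register and the packet's group code, which is an assumption.

```python
def accepts(kind, value, port, gp):
    """Decide whether a switch memory keeps a packet.

    kind is "single", "all" or "group" (hypothetical labels for the
    three forms of output link specifier); value is the link number or
    group code; port and gp stand for the PORT and GP register values.
    """
    if kind == "all":
        return True
    if kind == "single":
        return value == port
    if kind == "group":
        # Assumed meaning of "matches": a bit set in both values.
        return (gp & value) != 0
    raise ValueError("unknown specifier kind: %r" % (kind,))
```

Under this reading, a switch memory may belong to several groups at once by setting several bits in its group register.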

OFFSET 507 permits switch memory CPU 307 to "reserve space" ahead
of a packet. When the nybbles of the packet are received in input
shift registers 401, they are placed in the input shift registers
401 beginning at the location in the shift registers specified by
the value in OFFSET 507. The entire contents of input shift
register 401 are then moved as previously described into nybble
memory 407, including the "empty" portion of input shift register
401 ahead of the position indicated by the first value in OFFSET
507. Switch memory CPU 307 can then write information into that
empty portion. Further, by setting the second value in OFFSET 507,
switch memory CPU 307 can determine how much of the contents of
the packet and the "empty" portion are output to transmitter 219.
By setting values in OFFSET 507, switch memory CPU 307 can perform
operations such as adding a new header to a packet, deleting a
header that is no longer needed, or adding information to a packet
which is needed only while the packet is in the packet switch.
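
The effect of the two values in OFFSET 507 can be illustrated with
a small model. The sketch below is hypothetical: the names
`load_shift_register` and `output_from`, the use of `None` for
positions holding no nybble, and the treatment of the second value
as a starting position for output are assumptions, not details
taken from the preferred embodiment.

```python
EMPTY = None  # marks a position that holds no packet nybble


def load_shift_register(packet, width, in_offset):
    """Place the packet's nybbles in a register of `width` positions,
    leaving `in_offset` positions of reserved space ahead of them."""
    if in_offset + len(packet) > width:
        raise ValueError("packet does not fit behind the reserved space")
    register = [EMPTY] * width
    register[in_offset:in_offset + len(packet)] = packet
    return register


def output_from(row, out_offset):
    """Return the nybbles sent to the transmitter, starting at out_offset."""
    return [n for n in row[out_offset:] if n is not EMPTY]


# Reserve two positions, let the CPU fill them with a new header,
# then output the header together with the packet.
row = load_shift_register([0xA, 0xB, 0xC], width=8, in_offset=2)
row[0:2] = [0x1, 0x2]            # CPU writes into the reserved space
with_header = output_from(row, 0)

# Starting output further along drops the leading nybbles instead.
stripped = output_from(row, 3)
```

Because the whole row moves as a unit, adding or deleting a header
never requires shifting the packet within memory; only the two
offsets change.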

The use of OFFSET 507 illustrates an important principle of the
packet switch of the present invention: that the nybbles belonging
to the packets, and therefore the packets, are always moved in
their entirety, be it between the shift registers and nybble
memory 407 or within nybble memory 407. Another feature of the
packet switch which illustrates this principle is of course fast
copy latch 535.

The remaining registers in CPU interface 501 belong to address
array (ADDRA) 519, which contains address information about the 16
queues into which output queues 213(i) in switch memory 211(i) may
be organized. Address array 519 will be discussed in detail
together with the organization of switch memory 211(i).

Yet to be discussed is arbitration and memory control 531. It has
two functions: managing the queues in nybble memory 407, which
will be discussed in connection with those queues, and arbitrating
access to nybble memory 407. As is apparent from FIG. 4, data
flows between nybble memory 407 and any of 17 sources and
destinations: the 14 input shift registers and the two output
shift registers via nybble bus 403, and switch memory CPU 307 via
CPU bus 305. Arbitration and memory control 531 arbitrates among
these sources and destinations as follows: as regards the sources
and destinations connected to nybble bus 403, the active output
shift register 405 has the highest priority, followed by the 14
input shift registers 401. If more than one input shift register
401 is contending, the input shift registers 401 are given access
to the bus in round-robin fashion. Memory accesses by CPU 307 via
CPU bus 305 have the lowest priority.
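
The priority order just described can be modelled as follows. This
is a minimal sketch under stated assumptions: the class name
`Arbiter`, the one-grant-per-call interface, and the way the
round-robin position advances past the last input served are
illustrative choices and are not taken from arbitration and memory
control 531.

```python
class Arbiter:
    """Grants one requester per memory cycle: the active output shift
    register first, then the input shift registers in round-robin
    order, and the CPU last."""

    def __init__(self, num_inputs=14):
        self.num_inputs = num_inputs
        self.next_input = 0  # round-robin position among the inputs

    def grant(self, output_request, input_requests, cpu_request):
        """Return ("output", None), ("input", index), ("cpu", None), or None."""
        if output_request:
            return ("output", None)
        for step in range(self.num_inputs):
            index = (self.next_input + step) % self.num_inputs
            if index in input_requests:
                # Resume the search after the register just served.
                self.next_input = (index + 1) % self.num_inputs
                return ("input", index)
        if cpu_request:
            return ("cpu", None)
        return None


arbiter = Arbiter()
first = arbiter.grant(False, {3, 9}, True)    # input 3 is served
second = arbiter.grant(False, {3, 9}, True)   # then input 9
third = arbiter.grant(True, {3, 9}, True)     # the output register wins
fourth = arbiter.grant(False, set(), True)    # CPU only when nothing else waits
```

In this model the CPU is served only in cycles in which no shift
register is requesting the bus, which matches its lowest-priority
position in the text.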

Continuing with the organization of output queues 213(i), FIG. 6
shows details of their organization. There are two modes,
determined by the setting of FIFO register 517. Mode 617 shows the
organization when FIFO register 517 indicates that output queues
213(i) are organized as a single queue. In that case, output
queues 213(i) function as a single circular output queue (SOQ)
618. Any packet received from the set of input shift registers 401
corresponding to a particular input portion 202 which is destined
for output link 221(i) is placed at the tail of single output
queue 618, which is indicated by write pointer 619; meanwhile, the
active output shift register 405(i) reads packets from the head of
single output queue 618, which is indicated by read pointer 621.
As the writes and reads occur, pointers 619 and 621 are
automatically updated by hardware in arbitration and memory
control 531. In this mode, switch memory VLSIs 303 can operate
without a switch memory CPU 307.

Mode 601 shows how switch memory CPU 307 can organize output
queues 213(i) and thereby the nybble memories 407 which make them
up into 16 circular queues and a waiting area. The queues are the
following: an input shift register queue (ISRQ) 603 for each of
the 14 input shift registers 401 and an output shift register
queue (OSRQ) 615 for each of the two output shift registers 405.
Switch memory CPU 307 employs waiting area (WA) 613 to transfer
packets from one of the input shift register queues to one of the
output shift register queues. While a packet is in waiting area
613, switch memory CPU 307 may read or modify the contents of the
header or the message, as well as the contents of any "empty
space" stored with the packet.

Switch memory CPU 307 defines the locations and sizes of the 16
input shift register queues 603 and output shift register queues
615 by means of address array 519. There is an address array entry
521 for each of the 16 queues; each entry 521 contains four
fields. Two of the fields define the boundaries of the queue's
area of nybble memory 407:

• base field (B) 525 contains base pointer 605 for the queue. The
base pointer indicates the start of the queue's area in nybble
memory 407;

• limit field (L) 523 contains limit pointer 611 for the queue;
the limit pointer indicates the end of the queue's area in nybble
memory 407.

These fields are set by switch memory CPU 307. The remaining two
fields define the current head and tail of the queue.

• read field (R) 527 contains read pointer 609, which indicates
the address of the packet currently at the head of the queue.

• write field (W) 529 contains write pointer 607, which indicates
the address of the next location at which a packet may be written
to the queue.

The read and write fields are updated automatically 
by arbitration and memory control 531 as packets are 
read from and written to the queues. 
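
The four fields of an address array entry are enough to describe a
circular queue. The following sketch is a software analogy only:
the class name `QueueEntry`, the inclusive limit, the treatment of
each memory location as holding one whole packet, and the `count`
field used to tell a full queue from an empty one are assumptions
made for the example and are not described in the source.

```python
class QueueEntry:
    """One address array entry. Base and limit are set by the CPU;
    read and write advance as packets leave and enter the queue."""

    def __init__(self, base, limit):
        self.base = base      # first location of the queue's area
        self.limit = limit    # last location of the area (inclusive, by assumption)
        self.read = base      # location of the packet at the head
        self.write = base     # next free location at the tail
        self.count = 0        # hypothetical occupancy counter

    def _advance(self, pointer):
        # Wrap from the limit back to the base, making the queue circular.
        return self.base if pointer == self.limit else pointer + 1

    def enqueue(self, memory, packet):
        if self.count == self.limit - self.base + 1:
            raise OverflowError("queue full")
        memory[self.write] = packet
        self.write = self._advance(self.write)
        self.count += 1

    def dequeue(self, memory):
        if self.count == 0:
            raise IndexError("queue empty")
        packet = memory[self.read]
        self.read = self._advance(self.read)
        self.count -= 1
        return packet


memory = [None] * 16
queue = QueueEntry(base=4, limit=6)
for name in ("a", "b", "c"):
    queue.enqueue(memory, name)
head = queue.dequeue(memory)      # frees location 4
queue.enqueue(memory, "d")        # the write pointer wraps back to location 4
```

Moving a queue or changing its size then amounts to rewriting the
base and limit fields, which is why only the CPU sets them.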

Operation of nybble memory 407 in non-FIFO mode is as follows:
when the input shift registers 401 for an input portion 202
contain an entire packet, the input shift register 401 for the
input portion in each of the switch memory VLSIs 303 indicates to
control 531 that it wishes to write to nybble memory 407; control
531 responds by waiting until the arbitration logic gives that
input shift register 401 access to nybble bus 403 and then writes
the packet to the location indicated by write pointer 607 in the
input shift register queue 603 corresponding to the input shift
register.

Switch memory CPU 307 moves packets from the heads of the input
shift register queues 603 (indicated by read pointers 609) to
locations in waiting area 613; it then moves packets from the
locations in waiting area 613 to the tail of output shift register
queue 615 for the active output shift register 405; these moves
are done using fast copy latch 535. When the active output shift
register 405 has finished outputting the packet which it currently
contains to bus 215, it signals control 531 that it is ready for
the next packet, and control 531 places the packet at the head of
output shift register queue 615 for active output shift register
405 on nybble bus 403, from whence it is loaded into active output
shift register 405.

An advantage of non-FIFO mode 601 is that it may be used to
implement a packet network in which packets may have differing
priorities. For example, the packets may move in virtual circuits,
and a virtual circuit which is connecting real-time devices such
as a TV transmitter and a TV receiver may have a higher priority
than a virtual circuit which is connecting non-real-time devices
such as the electronic mail programs in two computer systems. Such
a system may be implemented in non-FIFO mode 601 by employing
switch memory CPU 307 to set up a high-priority and a low-priority
queue in waiting area 613. These queues are managed completely by
CPU 307. CPU 307 then moves a packet from an input shift register
queue 603, examines the packet's header to determine its priority,
then, depending on the priority, places the packet in either the
high-priority or the low-priority queue. CPU 307 may then move
some number of packets from the high-priority queue in waiting
area 613 to the tail of output shift register queue 615 before it
moves any from the low-priority queue to the tail of that queue,
thus giving the high-priority packets access to output link 221
more often than the low-priority packets.
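
One way the CPU's handling of the two waiting-area queues might
look in software is sketched below. Everything specific in it is an
assumption: the function names, the `is_high_priority` predicate
standing in for examination of the packet header, and the policy of
moving up to `burst` high-priority packets for each low-priority
packet. The source says only that some number of high-priority
packets are moved before any low-priority ones.

```python
from collections import deque


def classify(isrq, high, low, is_high_priority):
    """Move every packet from an input shift register queue into one
    of the two waiting-area queues according to its priority."""
    while isrq:
        packet = isrq.popleft()
        (high if is_high_priority(packet) else low).append(packet)


def schedule(high, low, osrq, burst=3):
    """Move up to `burst` high-priority packets to the output shift
    register queue, then at most one low-priority packet."""
    for _ in range(burst):
        if not high:
            break
        osrq.append(high.popleft())
    if low:
        osrq.append(low.popleft())


isrq = deque(["tv1", "mail1", "tv2", "tv3", "tv4", "mail2"])
high, low, osrq = deque(), deque(), deque()
classify(isrq, high, low, lambda packet: packet.startswith("tv"))
schedule(high, low, osrq)   # osrq now holds tv1, tv2, tv3, mail1
schedule(high, low, osrq)   # then tv4 and mail2 are appended
```

Giving the low-priority queue one slot per round keeps it from
being starved; a policy that always drains the high-priority queue
first would fit the description equally well.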

Layout of Switch Memory VLSI 303(j): FIG. 7 

FIG. 7 shows the layout of a preferred embodiment of switch memory
VLSI 303(j) in a 175-pin PGA package 701. The preferred embodiment
is implemented using a 0.5 micron 2-level metal CMOS process.
There are approximately 1.3 million devices on the chip, with most
of them being used to implement nybble memory 407 and the 16 shift
registers 401 and 405. Nybble memory 407, implemented as a static
RAM cell array, is in the center of FIG. 7; at its top and left
side are the column decoders 705 and the row decoders 703 which
address the cells of nybble memory 407. Above column decoders 705
are the two output shift registers 405; below nybble memory 407
are nybble bus 403 and the 14 input shift registers 401; packet
selector 707 determines which of shift registers 401 is to output
its contents to nybble memory 407, and arbitration and memory
control 531 performs the arbitration and memory control functions
previously discussed.

144 of the 175 pins in the package are used for the device; of
these, there are 26 power and ground pins, 56 pins for data inputs
to the input shift registers 401, and 14 pins each for the ICLK,
IPKT, and IREQ control inputs to the input shift registers. There
are a further 8 pins for data outputs from the output shift
registers 405 and two pins each for the OCLK, OPDT, RDY, and OREQ
signals. The interface between switch memory CPU 307 and switch
memory VLSI 303(j) has 39 pins; four are for input and output of
data, 21 are for addresses in output queues 213, and the remainder
are for the following signals:

• SYSCLK: the system clock for switch memory 211;

• RESET: reset the switch memory VLSI 303(j);

• CS: select the switch memory VLSI 303(j);

• RD: specify a read or a write operation on nybble memory 407;

• IACK: specify that data has been received by CPU 307 in a read
operation;

• BURST: specify a burst read or write operation;

• OE: enable outputs from the active output shift register 405;

• DTACK: indicate that data has been received from CPU 307;

• INT: interrupt to CPU 307.

As is apparent from the foregoing set of signals, switch memory
CPU 307 can read and write individual nybbles of packets stored in
output queues 213.

Conclusion

The foregoing Detailed Description has disclosed to one of
ordinary skill in the art how a packet switch may be constructed
which avoids the speed problems associated with standard memory
VLSIs and the electrical problems associated with very wide buses
which cross chip boundaries. The packet switch of the invention
includes switch memory VLSIs. Each switch memory VLSI contains
memory which is wide enough to store 1 nybble from all of the
words of the packet in parallel and a data path which is as wide
as the memory and which connects the memory to shift registers
into which and from which the nybbles to be stored in the memory
are transferred serially. Because the memory is very wide and the
data paths do not cross chip boundaries, the speed and electrical
problems of the prior art are avoided, very high memory bandwidths
are achieved, and the packet switch can operate at a higher rate
than prior-art packet switches with memory.

As will be apparent to those of ordinary skill in the art, many
other embodiments which incorporate the principles of the packet
switch disclosed herein are possible. The preferred embodiment is
designed for packet switching systems in which the packets may
have varying lengths; however, the principles of the invention are
equally advantageous in systems in which the packets have a fixed
length, such as systems using the 53-byte ATM packets.

Many additional embodiments are possible. For example, the switch
memories may be organized to increase the number of input links,
they may be organized to receive words of different sizes, and
they may be organized in the same fashion as interleaved memories.
Further, individual switch memory VLSIs may receive or output
slices which are larger or smaller than the nybbles employed in
the preferred embodiment, and as feature sizes in integrated
circuits decrease, both the data paths and the memories in the
switch memory VLSIs may be made wider and other devices may be
incorporated into the switch memory VLSIs. Additionally, while the
preferred embodiment is implemented using CMOS technology, the
techniques of the invention are in no way dependent on that
technology. Finally, other embodiments may employ different
organizations of the memory in the switch memory, as required for
the queueing discipline employed in the packet switch to which the
switch memory belongs. Such organizations may be predefined, as is
the case with the FIFO mode in the preferred embodiment, or they
may be defined by a processor.



Claims 

1. A switch fabric for switching packets characterized by:

packet receiving means (401(0,0..15,13)) coupled to a plurality of
input ports for serially receiving the packets from the input
ports;

packet outputting means (405(0,0..13,1)) coupled to a plurality of
output ports for serially outputting the packets to the output
ports;

packet memory means (407(0..13)) for storing the packets; and

packet transfer means (403) for transferring the packets in
parallel between the packet receiving means, the packet outputting
means, and the packet memory means,

the switch fabric being fabricated in one or more integrated
circuits (303) such that the transfer means remains within the
boundaries of the integrated circuits.

2. The switch fabric set forth in claim 1 characterized in that:

the switch fabric is subdivided into a plurality of switch
memories (211), each switch memory being coupled to at least one
of the output ports and a plurality of the input ports and each
switch memory comprising:

a plurality of switch memory packet receiving means (401(i,0..13))
for receiving packets from the input ports to which the switch
memory is coupled,

a switch memory packet outputting means (401(i,0..11)) for
outputting packets to the output port to which the switch memory
is coupled,

switch memory packet accepting means (533) coupled to the switch
memory packet receiving means for accepting only those packets
received in the switch memory packet receiving means which are to
be output to the output port to which the switch memory is
coupled,

switch memory packet memory means (407(i)) for storing the
accepted packets, and

switch memory packet transfer means (403) for transferring the
accepted packets in parallel between the plurality of switch
memory packet receiving means, the switch memory packet memory
means, and the switch memory packet outputting means,

and wherein in the plurality of switch memories, the plurality of
switch memory packet receiving means make up the packet receiving
means, the switch memory packet outputting means make up the
packet outputting means, the switch memory packet memory means
make up the packet memory means, and the switch memory packet
transfer means make up the packet transfer means.

3. The switch fabric set forth in claim 2 characterized in that:

the packets received by the switch memories are received as a
sequence of words;

each switch memory includes a plurality of switch memory
integrated circuits (303), each of which processes a slice of the
words;

each switch memory integrated circuit includes

a plurality of input shift register means (401(0..13)) for
serially receiving the slices from the plurality of input ports to
which the switch memory is coupled,

an output shift register means (405) for serially outputting
slices of the sequences of words of accepted packets to the output
port to which the switch memory is coupled,

slice accepting means (533) coupled to the input shift register
means for accepting only received slices of accepted packets,

slice memory means (407) for storing the slices of the accepted
packets,

slice transfer means (403) for transferring the slices of the
accepted packets in parallel between the input shift register
means, the output shift register means, and the slice memory
means, and

in the plurality of switch memory integrated circuits,

the input shift register means make up the switch memory packet
receiving means,

the output shift register means make up the switch memory packet
outputting means,

the slice accepting means make up the packet accepting means,

the slice memory means make up the switch memory packet memory
means, and

the slice transfer means make up the switch memory transfer means.

4. The switch fabric set forth in claim 1 character- 
ized in that: 

the rate at which any of the packet receiv- 
ing means receives the packets is independent of 
the rates at which the other packet receiving 
means receive the packets. 

5. The switch fabric set forth in claim 1 character- 
ized in that: 

the rate at which any of the packet output- 
ting means outputs the packets is independent of 
the rates at which the other packet outputting 
means output the packets. 

6. The switch fabric set forth in claim 1 character- 
ized in that: 

the switch fabric further comprises packet 
accepting means (533) for accepting received 
packets for output to the output ports as required 
for the packets' destinations. 

7. The switch fabric set forth in claim 1 further char- 
acterized by: 

means (403) for moving packets stored in 
the packet memory means in parallel between lo- 
cations in the packet memory means. 

8. The switch fabric set forth in claim 1 character- 
ized in that: 

the switch fabric operates in a first mode 
wherein the packet memory means is a single 
queue. 

9. The switch fabric set forth in claim 8 character- 
ized in that: 

the switch fabric further operates in a sec- 
ond mode wherein the packet memory means 
contains a plurality of queues. 

10. The switch fabric set forth in claim 1 further char- 
acterized by: 

means for modifying the manner in which 
the packets are stored in the packet receiving 
means and/or the manner in which the packets 
are stored in the packet outputting means. 